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ABSTRACT. Given a large symmetric, positive semidefinite matrix A £ R mxm , the goal of this 
paper is to show how a numerical approximation of the entropy, given by the scalar quantity <£(A) = 
~ J2\ecr(A) ^ l°gM' where u(A) denotes the spectrum of A, can be computed in an efficient way. 
An application from quantum-optics illustrates the new algorithm. 



1. Introduction 

■ In quantum mechanics the most general description of a state is given by a Hermitean, positive 

\ semidefinite operator commonly referred to as density operator. The density operator allows to 

quantify the amount of disorder in a quantum system in terms of the (von Neumann) entropy 
lTr9l . The concept of entropy can be extended to a valid measure of non-classical correlations 
between the subsystems of an entangled state each described by its own density matrix [2, 31 [TBI . 
More specifically, focusing on real symmetric, positive semidefinite matrices A <E R mxm , i.e., 

A = A T , v T Av > Vv G M m , 

the entropy of A may be defined by 

(1) <S(A) = - £ £(A). 

\ea(A) 

Here, cr(A) c K signifies the spectrum of A, and £ is a continuous function on the real inter- 
val [0, oo ) which is given by 



(2) £ : [0, oo) -> E, x i — y 



x log (a;) if x > 
if x = 



Computing the entropy of a matrix. By the symmetry of A, the spectral theorem implies that A 
can be diagonalized by means of an orthogonal matrix Q £ M. mXm , Q 1 = Q T , i.e. 

A = QDQ T , 

where D = diag (Ai, A 2 , . . . , A m ), and Ai, A2, ■ • ■ , A m > denote the eigenvalues of A. Then, we 
may define the matrix function 

C ' IR m x m ^ ]R m x m 

induced by 10 (and denoted with the same letter) in a standard way (see, e.g., flOl ) by 

C(A) = QC(D)Q T , 

where 

C(D) = diag (r(Ai), £(A 2 ), . . . , £(A m )) . 

Furthermore, we see that 

£{A) = -tr(£(£>)) - -tr(Q T £(^4)Q). 
Moreover, since the trace of a matrix is invariant with respect to similarity we arrive at 
(3) <£(A) = -tr(£(A)). 
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At first glance, computing the entropy is possible by either applying ((TJ, i.e., by computing the 
full spectrum of A, or by using formula 10 which involves the computation of the matrix loga- 
rithm. Evidently, for large matrices, both approaches are prone to be computationally unfeasible 
due to their high degree of complexity. 

A new algorithm. The goal of this paper is to calculate the entropy of a matrix without the need 
of finding the eigenvalues of A or the necessity of computing the matrix logarithm of A explicitly. 
To this end, two key ingredients will be taken into account: 

• The function C will be approximated by a polynomial p; in so doing the term C(A) ss p(A) 
can be expressed approximately as a sum of powers of A. This avoids the computation 
of the matrix logarithm. 

• Using the relations 

<£(A) = -tr(£(A)) « -tr(p(A)), 

the entropy of A can be found approximately by computing the trace of p{A). This quan- 
tity, in turn, may be determined numerically by appropriately combining a Monte-Carlo 
procedure and a Clenshaw type scheme. In this way, the explicit computation of the 
matrix powers occurring in p(A) can be circumvented. 

Applying these ideas, we will obtain a low-complexity algorithm for the matrix entropy which is 
still able to generate accurate computational results. As a practical application we will consider 
a two-photon state entangled in frequency. More precisely, we consider large density matrices 
resulting from suitable discretizations of the related continuous operators. We will use the new 
algorithm developed in this paper to demonstrate entanglement quantification by means of the 
entropy for a large real- valued density matrix A € M. mxrn , with a matrix size of the order m = 
O(10 8 ). 

The article is organized as follows: In Section [2] we will recall a Monte-Carlo procedure pro- 
posed in [1J to compute the trace of a matrix function. Subsequently, a Chebyshev approximation 
polynomial of the function C will be derived in Section |3j together with a sharp error estimate 
with respect to the supremum norm. Furthermore, Section [4] contains the new algorithm and a 
probabilistic error analysis. Next, we present some numerical examples including a traditional 
finite element matrix and the above-mentioned quantum optics application in Section [5] Finally, 
we draw some conclusions in Section[6] 

Throughout the paper, || ■ ||2 denotes the Euclidean norm. Furthermore, tr( ) signifies the trace 
of a matrix, i.e., the sum of its diagonal entries. We note the fact that tr(A) = X^Aecr(A) ^ ^ or 
any A G R mXTO . We also notice that, for any 70 > 0, there holds that 

= - 70 tr (C (7o~^)) - log(7o)tr(A). 

The appealing property of this identity is that it allows to compute the entropy by means of 
the function C as restricted to the interval [0, A /-yo]- In particular, since we approximate £ by a 
Chebyshev polynomial, the interval of approximation can be limited from [0, maxxe a (A) A ] to the 
smaller interval [0, Jq 1 max Aecr ( A ) A], if 70 > 1. 

2. Monte-Carlo Approximation 

The following proposition, see, e.g., I61 H21 , motivates a Monte-Carlo procedure for the com- 
putation of the trace of a symmetric matrix. Throughout the paper, we let 51 = {— 1,+1}, 
and ft™ = {-l,+l} m c M m . 

Proposition 2.1. Consider a symmetric matrix A € R mxm w ith tr(A) 7^ 0. Furthermore, let X be a 
random variable that takes values —1 and 1 with probability 1 /2 each. Moreover, let u> e fl m be a vector 
ofm independent samples generated by X. Then, 

E(w T Aw) = tr(A), 



(4) <£(A) = - ]T 



7o£ ( — ) + log(7o)A 
7o 



where E denotes the expected value. 
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From a practical point of view, this result allows for the computation of a numerical approxi- 
mation of the trace of a symmetric matrix A by taking the mean of a finite number N of sample 
computations, 

1 N 

tr (A) Ki—J^ujAui, 

i=l 

where u>i G Sl m are random vectors like defined above. Thence, recalling H), we find, for any 70 > 
0, that 

JV 

(5) <S(A) « E w ^ £ fro" 1 ^) - Iog(7b)tr(A). 

In order to provide bounds for terms of the form a> T £ (~/q 1 A) oj, for u> G fi TO , we note the 
following lemma. It is based on an elementary analysis of the graph of £ 

Lemma 2.2. Let xq > 0. Then, there hold the estimates 

min(£(xo), e~ sign(e — xq)) < C(x) < max(0, C(xq)), 

for any x G [0, xo]. Here, 

{-1 ifx < 
ifx = 
1 ifx > 

is the sign function. 

We continue by deriving some upper and lower bounds for v T C(A)v. 

Corollary 2.3. Let A G R mxm be real symmetric, and positive semidefinite, and a(A) c [0, xo"/o],f or 
some xo, 70 > 0. Furthermore, consider v G f2 m . Then, the estimates 

77170 min(£(:ro), e _1 sign(e _1 — xq)) < jqV T C^/q 1 A)v < rrvyo max(0, C(xq)) 

hold true for any v G fl" 1 . 

Proof. Let v G f2 m . We choose an orthogonal matrix Q such that there holds Q T AQ = D = 
diag (Ai , A 2 , . . . , A m ) . Then, 

m 

^ T £( 7( r 1 A).; = (g T .;) T /:(7 - 1 £>)Q T 1 ; = J2(Q T ^)^ho 1 ^)- 

i=l 

Then, using the upper bound from Lemma IZ2l and the identity ||Q T u||2 = IMI2 = \frn, it follows 
that 

^ T £(7 ( 7 1 A)v < i-R&x(Q 7 £(x ))J2(Q Tv )i = mmax(0,£(a;o)). 

i=l 

The proof of the lower bound is completely analogous. □ 

Remark 2.4. Suppose that A max > is the maximal eigenvalue of A. Then, we choose xq, 70 > 
such that X070 = A max . Evidently, cr(A) c [0, xo 7 o]- Hence, by the previous Corollary l2.31 we see 
that 

Amaxmi^ 1 min(£(x ), e _1 sign(e _1 - x )) < Jqv t C(jq 1 A)v < A max mxo 1 max(0, £(.t )). 

We may ask the question of how to choose xq and 70 such that the upper and lower bound in the 
above estimates are as close as possible to each other. In other words, we seek XQ Pt > such that 
the function 

(6) d(xo) = Xq 1 [max(0, C(xq)) — min(£(a;o), e _1 sign(e _1 — Xo))] 

is minimal. It turns out that this is the case for Xq PI = 1, i.e., 70 = A max ; see Figured] 



4 



T. P. WIHLER, B. BESSIRE, AND A. STEFANOV 




FIGURE 1. Graph of the function d(xo) from © in [0, 5]. 

3. Chebyshev Approximation and Clenshaw's Algorithm 
Let us recall the Chebyshev polynomials of the first kind 

T n (x) = cos(n arccos(a;)), n > 0, 
on the reference interval I = [—1,1]. They satisfy the three term recurrence relation 

T n+1 (x) = 2xT n (x) - r„_i(2c), n > 1, 
with Tq(x) = 1, T\{x) = x. These polynomials are orthogonal with respect to the inner product 



More precisely, 



7r if m = n = 0, 

(T m ,T„) = \ 5 mn Tr 

— — it m + n > U 



where S mn is Kronecker's delta; see, e.g., (7). Then, for xq > 0, the affine transformation 



(7) 



F: [-1,1]-). [0,10], F(x) = ^(x + 1), F- 1 ^) = — x - 1, 

2 a; 



allows to define the Chebyshev polynomials {T„}„>o on an interval / = [0, xq\: 

T n =T n oF~\ xel=[0,x }. 

Proposition 3.1. Letxo > 0,andn e No- Then the function Con the interval I = [0,xq] is approximated 
by the polynomial function 



(8) 



Pn(x) = y + ^2 a k T k (x), 



where the coefficients {afc}£ =0 are given by 



(9) 
(10) 



a Q = x Q ( log ( y ) + 1 



fe=i 



xo ( fx 



3 , 



(-l) k xp 

k{k 2 -iy 



k>2. 
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Furthermore, for n>l, there holds the error estimate 

(11) ||£-Pn||oo,(0 )a! o)< 2n( ; U +1) , 

with || ■ || 00.(0, K ) denoting the supremum norm on I. 
Proof. We begin by defining the function 



C = CoF: £^^(J+l)log 



(x + 1) 



on I = [—1,1]. Using standard Fourier theory and affine scaling, the function C can be repre- 
sented by the infinite series 



£( x ) = + J2 a ^) = t + E aki ? k ° ^x*) = F ^ x 



k=l 



k=l 



with 



flfc = - 



2 f 1 C(x)f k {x) 



7T 7-1 vi - X 2 



dx. 



Then, we define the polynomial p n by truncation: 



Pn(x) 



y^q fc r fc (x). 



fe=i 



The coefficients {a fc })! =0 can be computed by employing the substitution x = cos(i), cf. (T21 - This 
implies that 

2 r - 

£(cosi) cos(fci) dt, fc > 0. 



7T 



(I 



For k = 0, 1 we find the formulas © by direct calculation. In addition, noting the identity cos t = 
h (e lt + e~ lt ) and switching to complex variables z = e lt , dz = iz dt, the formula flu) follows from 
the residual theorem. 



As for the error estimate, we notice that 1 1 T) 



fell oo,(0, Xq) 



l^fe||oo,(-l,l) 



||£-Pn||oo,(0,x ) 

Noticing the telescope sum 

^ k(k 2 - 1) 1 ^ 



< Kl = XQ 



1. Then, for n> 1, 
1 



oo,(0,a;o) 



/fc(fc 2 - 1)' 



k=n+l 

completes the proof. 



k(k 2 -l) 2 k £^ i \k(k-l) {k + l)kj 2n(n + l) 



□ 



Remark 3.2. The polynomials p n , for n — 0, 1, 2, 3, and xo = 3 are shown, together with the moduli 
of the approximation errors, in Figure E] 

The above result implies the following estimates: 

Corollary 3.3. Let A g R mxm be symmetric and positive semidefinite, with a (A) C [0, xo7o]/ for 
some x , 7o > 0. Furthermore, consider v G Q m . Then, for n > 1, it holds the bound 



ItTOCfro^AJ-p^A))*! < 



2n(n + l) 

ztfere p„ is f/xe Chebyshev approximation polynomial of the function Cfrom ((S). 
Proo/. Let Q e R™ xm be orthogonal such that Q T AQ = diag (Ai, A 2 , . . . , A m ). Then, 
v T (C(% 1 A) - Pn ( 7o - 1 A))« = (Q T ^) T (£( 7o - 1 J D) -p„(7 - 1 J D))Q T « 

= £(Q T «)?0C(7o -1 Ai) -Pntio^i))- 

i=l 
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FIGURE 2. Polynomials p n , for n = 0, 1,2,3, and xo = 3. Left: Graphs of p Tl 
Right: Approximation errors \C(x) — p n (x)\. 



Hence, since ||Q T f H2 = IMU = ^/m,we arrive at 

\v T (Zho 1 A ) - Pnho 1 A )) v \ <m max \C(- / Q 1 X l )-p n ("/o 1 X t )\. 

l<i<m 

Finally, using the error estimate (Till , yields the desired result. □ 

In practical applications, Chebyshev series can be evaluated using the Clenshaw algorithm, 
see, e.g., (5)17): Starting from functions 

K+2(x) = b n+1 (x) = 0, 

and applying affine scaling from [—1, 1] to [0, xq], see 10, we define 

k = n, n — 1, . . . , 0, 



b k (x) = a k + 2 — x - 1 b k +i(x) - b k+2 (x), 
\x J 

where {a^} are the coefficients from i9t-(TT0ll. Then, it can be shown that p n from lO can be 
represented in the form 

Pn{x) = ^(a + b (x) - b 2 {x)). 

For a symmetric matrix A 6 R rnxm the polynomial "yoPnilQ 1 A) can be computed in a similar 
way (indeed, this is possible since all terms involved consist of commuting sums of powers of A): 
Setting B n+2 = B n+1 = 0e R raXra we define a finite sequence {B k }%±$ C R mxm of matrices by 
the reverse recurrence relation 

B k =a k I + 2 \—A - I ) B k+1 - B k+2 , 



XQlO 



k = n, n — 1, 



,0, 



where J signifies the identity matrix in ] 



\ Then, 



p„( 7o J A) = - (a I + B Q - B 2 ) 



From the above relation we find for a vector v G fl m that 

4 



B k v = a k v 



Xo^O 



-AB k+ iv - 2B k+1 v - B k+2 v, 



k = n, n — 1, 



,0. 



Therefore, introducing the variable y k — B k v, we see that 

4 



y k = a k v 



2:070 



-Ay k+ i - 2y k+1 - y k 



+2, 



for k = n, n — 1, . . . , 0, starting from y n + 2 = y n +i = 0. Finally, we obtain 

v T p n (lo 1 A)v = - (ma + v T (y - 1/2)) ■ 
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Consequently, evaluating the above product essentially amounts to n matrix-vector multiplica- 
tions. 

Algorithm 3.4. Let A g R mxm fa a symmetric matrix, and v £ Q m a vector. Then, for any 70 > 0, the 
quantity r y v T p n (^Q 1 A)v, where p n , n > 1, is the polynomial from © can be computed by means of the 
following procedure: 

(1) Set y n+2 = y n+1 = e M m . 

(2) For k = n,n — 1, . . . , do 

4 

y k = a k v H Ay k+1 - 2y k+1 - y k+2 . 

(3) Output Y (ma + v T (y - Hz))- 
Here, {a k }^ =0 are the coefficients from j9l-lfT0ll. 

4. Computing the Entropy 

We now return to the idea of computing a numerical approximation of the entropy of a matrix 
by means of 10. In order to avoid the computation of the matrix logarithm, however, we will use 
the approximation (£(A) « <£(A), where, for some 70 > to be specified later, 

N 

(12) £(A) = ~ E u *»» ^o 1 ^) Ui - log( 7 o)tr(A). 

»=i 

Here p n is the approximation polynomial of degree n for C from (8), and u»j € fi" 1 are random 
vectors. More precisely, we propose the following basic algorithm: 

Algorithm 4.1. Let A e R mx ™ be a real symmetric, positive semidefinite matrix, and n,N e N. 
Furthermore, choose 70 > 0. Then: 

(1) Compute N random vectors uii £ Q m , i = 1, 2, . . . , N, with entries ±1 occurring with the same 
probability 

(2) Determine the scalars & = 7ow I T Pn(7o~ 1 A)wi usim Al?orithm \3~4\ 

(3) Output -i £^ & - log( 7 o)tr(A). 

The approximation provided by the above algorithm has two essential error sources: Firstly, 
the use of the Monte-Carlo approach (0 brings about a certain randomness, and, secondly, re- 
placing the function C by p n in lfl2)l leads to an approximation error. The latter point has been 
addressed already in Corollary 13.31 In order to deal with the issue of randomness, we provide 
a confidence interval analysis for the numerical approximation l(T2l following the approach pre- 
sented in (TJ. To this end, we recall a special case of Hoeff ding's inequality fTlll : 

Proposition 4.2. Let X\, X 2 , ■ ■ ■ , Xn be independent random variables with zero means and bounded 
ranges a~ < Xi < af, i = 1,2, ... ,N. Then, for any 77 > 0, there holds the probability bound 

P(\X 1 +X 2 + ... + X N \ >rf) <2cxp 

In order to apply the previous result, we define, for i = 1, 2, . . . , N, the random variables 

X % = - lQ uJ C^ 1 A)^ - log( 70 )tr(A) - <S(A), 

where w, € fi m are random vectors with entries ±1 appearing with equal probability of 
Using (4), we conclude that 

X t = 70 [-uj C{^ x A)^ + trOC^A))] . 

According to Proposition ^. II we have that 

E(X i ) = 0, i = 1,2,..., N, 

provided that tr(£(A)) 7^ 0. Furthermore, we have that 

Xi = 70 [-uJ Pn {% l A)ui + tr(£( 7o - 1 A))] + TbwT [-C^ 1 A) + p^ 1 A)} u u 



( -v \ 
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and thus, with the aid of Corollary l3.3l 



with 



for i = 1, 2, . . . , N. Then, setting 
and 

(13) 6 = a 

we obtain the uniform bounds 



mx "fo 



a - amin = max -~f Q ujJ p rl (~/ 1 A)uj 1 - min -~f u}Jp n ('j 1 A)u} l - T - 

l<i<N l<i<N 7l[n "T 1 J 



"min < Xi < a max , 1 < i < N, 



and hence, by Hoeffding's inequality, Proposition 12.11 we find, for any ?; > 0, the probability 
estimate 



P 



(14) 



1 N 



- N 



To 
" N 



N 



- log( 7o )tr(A) - <£(A) 



> 



N 



< 2cxp (-2NS~ 2 (V^v) 2 ) • 
Using the approximation l(l2l . and recalling again Corollary 13 .31 results in 



£{A) - <£(A] 



To 
~N 



N 



ujvn (To ±A ) 



< 



N 



log( 7o )tr(A) - <S(A) 



JV 



< 



-^E«7(p»(7o l A)-r( 7o - 1 A)) Wj 

To 
'TV 

mx 7o 



w 7£(7o 



- log(7o)tr(A) - <£(A) 



2n(n+l) 
Thus, with (fT4t , we obtain 



To 
"TV 



AT 



log( 7o )tr(A) - (£(A) 



P 



£(A) - 2(A) I > ^ 



N 2n(n + l) 
^a; l T £(7 - 1 A)a; 

i=l 

< 2exp(-2iVcr 2 (V^v) 2 ) , 



< P 



To 
'AT 



AT 



log( 7o )tr(A) - <£(A) 



> 



N 



and therefore, 



(15) 



P 



<s(A) - 



< 



N 2n(n + 1) 

Now, fixing an error tolerance r > 0, we select v/n > such that 



> l-2cxp(-2iV ( 5" 2 (VA r ) 2 



_77_ _ _ mx 7o 
N ~ T 2n(n + 1) ' 



Hence, 



P ( <£(A) - <£(A) < t) > 1 - 2 cxp ( -2NS- 2 (t 



2n(ra + l) 
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We thus have proved the following result: 

Theorem 4.3. Consider a real symmetric, positive semidefinite matrix A e K mxm , and constants 
70, xq > such that a (A) C [0, 70^0] • Moreover, let t > Obe a prescribed error tolerance, and n G N a 
polynomial degree such that 

(16) > 2^TT)' 

T/zen, computing N <E N sample vectors 

wiefi ro , i= 1,2,. ..,7V, 
wif/? entries ±1 of equal probability the output of Algorithm \4.1\ denoted by <£(A), satisfies 



<£(A) - <£(A) 



< 7", 



(17) 

wif/? probability at least 



2n{n + l) / 



(18) p = 1 - 2 exp ^-2Nr 

Here, 5 = a max — a m ; n zs defined in (fl~3l >. 

Remark 4.4. The above theorem shows that, in order to achieve a certain prescribed accuracy r 
in the computations, the polynomial degree n of p n from |8]l needs to be sufficiently large in 
accordance with Q6J >. In addition, we see that the probability p of satisfying the error estimate lfl7t 
can be increased by adding more samples in the Monte-Carlo approach. In addition, from fTHl , it 
follows that 

1 2 ( _ mx 7o \~ ,__ ( 2 



iV = -<5 I r — n . log 



2 V 2n(n + l)/ b \l - p , 
Therefore, noticing that S = 0(m) (cf. Corollary l2.3l l may imply that the theorem could require N 
to be unfeasibly large. Consequently, again following [1, Section 4.2], it may often be more prac- 
tical to fix the number N of samples, or in this paper, a polynomial degree n beforehand, and to 
provide an error bound for a given probability p. Indeed, with p from ([18) we solve for r to arrive 
at 

(19) r- mXol ° 




2ra(n+l) V 2N b \l-p / 

where we have obeyed ((T6l l in choosing the sign in front of the square root. We notice that r is a 
sum of two independent error contributions. Thus, for given polynomial degree n it is reasonable 
to choose the number of samples N such that 

mxo^o _ 




2n(n + 1 

i.e., 

2n 2 (n + l)^ 2 log( T ^ 



(20) N - 

m l x%% 

This observation leads to Algorithm 14. 5l below. 

Algorithm 4.5. Let A G R mxm be a real symmetric, positive semidefinite matrix, and n eNa prescribed 
polynomial degree. Furthermore, choose xq, 70 > with a(A) c [0, xo7o]/ and a probability p <G (0, 1). 
Then: 

(1) Set i = 0, N = 1, £ max = -00, £ min = 00. 

(2) While i < N do 

• i = i + 1. 

• Find a random vector u>i G f2 m with entries ±1 occurring with the same probability X J%. 

• Determine the scalar & = j uij Pn^Q 1 A)u)i using Algorithm WM 
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• Compute 

• Find 



Cmin — m i n (£minj £max — m & x (Cmax) fi) 

mxojo 



max smm 



n(n + 1) ' 



and N from | |20|| , 
End do. 

(3) Output the approximate entropy 



i N 



and the error tolerance from (IT9l l. 

Remark 4.6. In accordance with Remark 12.41 it is sensible to choose xq = 1 and 70 = A max , 
where A max > is an upper bound on the spectrum a (A). Incidentally, A max could be deter- 
mined, for example, by means of the Gerschgorin circle theorem. 

5. Examples 

We shall now illustrate the method developed in this paper by means of two examples. 

5.1. A Finite Element Matrix. We consider the classical stiffness matrix 

/ 2 -1 •■■ \ 



-1 


Vo 



2 -1 







-1 2 -1 
0-12/ 



which appears in the discretization of the one-dimensional boundary value problem 

-u"(x)=f(x), xe(0,l), u(0) = u(l) = 0, 

by uniform linear finite element; see, e.g., fl"3l Chap. 1]. It can be shown that the eigenvalues of A 
are given by 



A, = 4 sin 



and therefore, 



<B{A) = -4^ si 



sin 



2m + 2 



in 



1 < i < m, 



,2m 

i=i 

4(2m + 2) r /2 
Jo 



log I 4 sin 



2m 



TC 



jin 2 (x) log (4sin 2 (x)) dx. 



Now, using the fact that J^ 2 sin 2 (x) log (4 sin 2 (x)) dx = ^/i, we see that <B(A) w —2m. In partic- 
ular, the entropy decreases asymptotically linearly as m —> 00. 

In Table[T]we present numerical results for a prescribed probability p = 0.95 and several poly- 
nomial degrees n. The latter quantity has been chosen 'by hand' with moderate growth as the 
matrix size m is increasing. We clearly see that the algorithm generates quite accurate results 
already for a low number of samples. Indeed, the relative errors are (except for in = 10) below 
1%, and the computed errors based on (fl9t are very reasonable as compared to the magnitude of 
the exact entropy. 
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matrix 


polyn. 


number of 


exact 


abs. err. 


rel. err. 


comput. 


size to 


deg. n 


samples N 


entropy 






err. 


10 


2 


21 


-19.232 


0.21265 


1.1057% 


6.6293 


50 


3 


33 


-99.228 


0.7396 


0.7453% 


16.587 


100 


3 


37 


-199.23 


0.0749 


0.0376% 


33.149 


500 


4 


18 


-999.23 


2.2705 


0.2272% 


98.905 


1000 


6 


35 


-1999.2 


1.1163 


0.0558% 


94.559 


5000 


8 


15 


-9999.2 


7.4981 


0.0750% 


275.54 



TABLE 1 . Entropy of a finite element matrix for various sizes to and p = 0.95. 



5.2. An Application in Quantum-Optics. Entangled photons have become a widely used non- 
classical light source to investigate fundamental aspects of entanglement Il20l l8l. Their unique 
properties have further paved the way to potentially practical applications in quantum communi- 
cation and quantum computing fl5l l9l. In recent years spontaneous parametric down-conversion 
(SPDC) has become the standard procedure to generate entangled photon states. SPDC occurs 
when a noncentrosymmetric crystal is pumped by a laser beam strong enough to induce nonlin- 
ear interactions. In this case, a pump photon with angular frequency lu p may be annihilated and 
two new photons of lower frequencies w, and u> s , denoted as the idler and the signal, are created. 
Energy conservation always demands coj +oj s =u) p . If the experimental configuration of the three 
involved photons is further restricted to the case where they propagate collinearly the resulting 
two-photon state, given by, 

(21) |*) = |0) + / Auj l [ dLu s f(u it u a ) at(ui)&t(u a ) |0), 



describes entanglement in the frequency domain iTTil . We consider here identically polarized 
photon states created by the action of aj(ojj), j £ {i, s}, on the combined vacuum state |0) = 
\0i) ® |0 S ). The state in (EB is an entangled state if the joint spectral amplitude 

n ~ ,, s ( (w 4 + w s -w cp ) 2 T 2 \ _ / Afc(wi,w,)L 
(22) f(Ui,u) s ) oc exp — — — y - sine ' 



81og(2) J V 2 

cannot be separated into a product /(cjj,w s ) = g(uJi)h(uj s ). The pump pulse with center fre- 
quency ui cp is represented by the exponential term in ll22Tl and its duration is given by t p . The 
parameter L denotes the length of the crystal. The efficiency of the SPDC process is dominated 
by Afc(wi,w s ) = ki{uji) + k s (uj s ) — k p {uji + u) s ) + ^r, where kj(uj) is the frequency-dependent 
propagation constant of a periodically poled crystal with poling period G. Using the correspond- 
ing Schmidt decomposition, the amount of entanglement in |2TJ can now be quantified by the 
entropy (|U of either idler or signal subsystem [4J. The state of each subsystem is described by its 
corresponding continuous density matrix, given by, 



Ai(uj,uj') = j du s f(u),u) s )f*(u)',oj 8 ), 
A 8 (l),w') = / dwj f(u)i,u})f*(ui,u;'). 



Due to the symmetry of f(tui, w s ) in (f22l l we define Ai(u, a/) = A s (lj, oj') = A(w, w') G R. In order 
to calculate ((4), the continuous function A(oj, w') has to be discretized on a lattice, i.e. A(u>, u>') —> 
A £ ]^jnxm^ w j m ^4 _ g mce a is the density operator of a physical state, its eigenvalues 
are further distributed such that A > for all A G p{A). For short pump pulses, where r p 
is of the order of fs, the exact <B(A) can be calculated by means of cr(A) since only small grid 
sizes (to rj 800) are needed to sufficiently resolve A(u>,ui') f!6l . Unfortunately, the grid sizes 
required to discretize A(oj,lu') for long pump pulses, e.g., for r p on a timescale of ns, are very 
large since in this case / (cjj , oj s ) is dominated by a narrow Gaussian function. Diagonalization of 
A is then practically unfeasible. However, a numerical approximation of the entropy according 
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FIGURE 3. Quantum optics application, for matrix size m = 0.72 • 10 s : Approx- 
imate entropy &(A/tr(A)) based on Algorithm 4.5 with p = 0.95 for various 
polynomial degrees n; the vertical bars indicate the computational error ranges 
according to l(19l . 

to Algorithm 4.5 is still possible. We have calculated the entropy <£(A/ti(A)) for a pump pulse 
duration of t v = 8.8 ns and a L = 11.5 mm long potassium titanyl phosphate crystal with G = 
9.013 fim. In the frequency domain, this specific choice of t p corresponds to a pump pulse with 
a narrow spectral bandwidth of 5 MHz. Notice, that the normalization A/ti(A) results from 
the fact that a physical state needs to be normalized; indeed, since tr(A) = J2\ea(A) ^ we have 
that tr(A/tr(A)) = 1. In order to save memory space we made use of the fact that only a small 
amount of entries in A are significantly nonzero which allows to store the matrix in sparse format. 
This procedure results in a band matrix with m ~ 0.72 • 10 s and 37 diagonals. Figure |3] shows 
convergence of <B(A/tr(A)) and the error tolerance from {19}. For polynomial degree n = 20 and 
error probability p = 0.95 we obtain (£(A/ti(A)) = 14.934 ± 0.127. Up to a polynomial degree of 
?i = 20, the discretization error for m = 0.72 • 10 8 is still smaller than the computational error r. 
For all ?i the number N of sample vectors is N = 8. Our computations where performed in 
MatlafJ] on a 12 core Intel Xeon X5650 (2.66 GHz) processor with 96 GB RAM. It is remarkable 
that for a matrix size m = 0.72 • 10 8 the computational time for the entropy only took about 25 
minutes. This clearly underlines the high efficiency of Algorithm 4.5 for this example. In the case 
of a maximally entangled, discrete bipartite system of finite dimension m 2 , the entropy increases 
according to £ = O(log(m)). Due to t v being of the order of ns the state under consideration 
exhibits a very high degree of entanglement and is therefore almost equivalent to a maximally 
entangled system with m ~ exp(14.934). 

6. Conclusions 

In this article, we have derived a new algorithm for the computation of the entropy of a large 
real symmetric, positive semidefinite matrix. The proposed procedure does neither require the 
computation of the spectrum nor of the matrix logarithm. Indeed, it is based on the following 
two main ideas: 

• Approximation of the 'entropy function' £ by a reasonably accurate Chebyshev polyno- 
mial. 



MATLAB is a trademark of The Math Works, inc. 
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• Computation of the entropy by combining a Monte-Carlo type sampling procedure and 
a Clenshaw algorithm for matrix polynomials. 

The new algorithm is parallelizable and straightforward to implement. It was tested for a classi- 
cal finite element matrix as well as for a large discretization matrix originating from a quantum- 
optics application. In both cases, our algorithm is able to achieve accurate results in a very effi- 
cient way. 

An interesting extension of this research constitutes the computation of the matrix entropy in 
the context of complex discrete Hermitean operators. Here, an important ingredient is the appro- 
priate redefinition of the function C for complex input values and the corresponding approxima- 
tion by polynomial functions for both the real as well as the imaginary part. 
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